SUMMARY PAGE


THE  PROBLEM 


In  previous  investigations,  attempts  were  made  to  isolate  the  most  critical  skills  and 
procedures  within  each  stage  of  replacement  air  group  (RAG)  training  in  the  F-4  aircraft.  For 
each of the stages analyzed, a small set of items was selected on the basis that they could
discriminate among replacement pilots according to their final RAG grade. On the basis of these
isolated skills, two fleet evaluation questionnaires were developed to be used by operational
F-4 squadron commanders. In addition to ratings on these two rating forms, squadron
commanders were asked to report “critical incidents”. These included such occurrences as
accidents, incidents, and “wings-pulled”. Data obtained from these two forms were used as the
criterion measures in this investigation.


FINDINGS 


Selected test scores and flight grades from undergraduate pilot training were used as
potential  predictors.  These  were  related  to  the  criteria  in  a  series  of  correlational  and  regression 
analyses.  A  number  of  significant  relationships  were  obtained  among  the  performance  measures. 

Such results indicated the method used in developing the rating form to be a feasible one.
Implications are discussed in terms of potential use for actual assignment of aviators to RAG
training in the F-4 aircraft.


INTRODUCTION 


The assessment of pilot performance entails two distinct problems: criterion definition
and predictor selection. The definition of a criterion usually represents a compromise between
some  hypothetical  “ideal”  measure  and  measures  which  are  readily  available.  One  of  the  major 
obstacles  to  criterion  specification  lies  in  the  diverse  nature  of  the  flight  program.  The  progress 
of  a  student  through  the  naval  aviation  training  syllabus  can  be  characterized  by  increased  dif¬ 
ferentiation  and  specificity.  It  is  only  during  the  preflight  and  primary  phases  of  training  that 
students receive the same instruction. Following primary, a student is assigned to either the jet,
prop, or helo pipeline, wherein the aircraft to be flown and syllabus requirements differ greatly.
At  each  successive  stage,  comparability  of  training  decreases  and  consequently  the  definition  of 
a  common  criterion  of  pilot  performance  becomes  increasingly  difficult. 

The postgraduate phase of training, or replacement air group (RAG) training, is even
further diversified. Preparation of the pilot for actual fleet operations involves diverse aircraft
with completely different handling characteristics and mission objectives. Furthermore, pilots
entering  RAG  training  usually  differ  in  terms  of  previous  experience.  Some  enter  as  nugget 
pilots  directly  from  the  training  command;  some  have  had  one  or  more  tours  in  other  opera¬ 
tional  fleet  aircraft,  while  others  have  had  previous  experience  as  an  undergraduate  training 
flight  instructor.  Still  others  enter  directly  from  an  interim  shore  billet  in  which  they  have 
received  little  flight  time. 

Once  the  aviator  completes  RAG  training  and  is  assigned  to  a  fleet  squadron,  there  is 
further  diversification.  Specific  mission  objectives  and  the  operational  environment  are  unlikely 
to be the same for different squadrons. Likewise, differences in experience level among
squadron members will usually exist.

From this brief description of naval aviation, it is readily apparent that the development
of adequate measures of pilot performance is greatly hindered by a lack of commonality
within  both  training  and  operational  commands.  The  fact  remains,  nonetheless,  that  the  pilot 
is  trained  to  become  an  integral  part  of  a  fleet  squadron.  Despite  methodological  and  practical 
difficulties,  the  hypothetical  “ideal”  criterion  must  reflect  the  manner  in  which  the  pilot  fulfills 
the  mission  objectives  of  his  aircraft  within  the  context  of  his  squadron. 

While  fleet  performance  is  the  ultimate  criterion,  a  number  of  intermediate  criteria 
may  be  defined.  Historically,  most  research  efforts  have  been  directed  toward  the  prediction  of 
intermediate  criteria  defined  at  the  undergraduate  level  of  training.  The  criterion  most  often 
adopted has been the success-failure dichotomy, or simply whether the student pilot
successfully completes undergraduate training and receives his wings. The present Student Pilot
Prediction System attests to the success of such efforts (1). Recently, the system has been extended to
provide  predictions  of  the  final  grade  to  aid  in  pipeline  assignment. 

At the RAG level, Hale, Rickus and Ambler (2) reported that certain grades obtained
during undergraduate training were related to performance defined on a success/fail dichotomy.
Specifically, combat-related skills appeared to be the best predictors. In a factor analytic study,
Hale, Smith, and Ambler (3) found that certain clusters of skills could be extracted from stage
grades obtained during the undergraduate and postgraduate phases of training. Generally, those
clusters appeared to be specific to the phase of training. From these data, a standardized RAG
grading form was designed to be used by all communities and is at present awaiting
implementation.


At the fleet level, Booth and Berkshire (4) reported a factor analysis of training
command grades and fleet evaluations. Different factor structures were obtained from samples of
jet pilots and helicopter pilots.

In  assessing  the  previous  literature,  it  is  difficult  to  draw  comparisons  as  a  result  of 
differences in sample group and time period. To date, an extensive investigation of a single
community from training command to RAG to fleet has not been performed. To provide such
longitudinal continuity, a series of studies by Shannon and Waag (5, 6) has attempted
to develop criterion measures of fleet performance for the F-4 community. In developing
criteria, two approaches were taken. The first involved the application of the critical incident
technique to fleet performance. An incident was said to occur if any one of a set number of
operationally defined criterion events occurred. In this manner, a dichotomous criterion was
established. Shannon and Waag (7) reported that scores obtained during RAG training were
significantly related to such a critical incident criterion.

The  second  approach  focused  upon  the  isolation  of  skills  within  RAG  training  as  a 
result  of  their  similarity  to  fleet  operations.  Using  an  item  analytic  procedure,  the  most  impor¬ 
tant  skills  and  procedures  were  isolated  in  which  the  final  RAG  grade  served  as  the  criterion.  On 
the basis of these isolated skills, two fleet evaluation questionnaires were developed (6, 7).

The  present  study  investigated  the  relationship  between  performance  during  under¬ 
graduate  training  and  performance  in  the  RAG  and  fleet.  Specifically,  an  attempt  was  made  to 
predict  RAG  performance  as  measured  by  the  final  RAG  grade  and  fleet  performance  as  esti¬ 
mated  by:  (1)  critical  incidents,  and  (2)  ratings  obtained  from  squadron  commanders.  In  both 
cases, stage grades earned during undergraduate training served as a potential set of predictors.

METHOD 


The  sample  group  consisted  of  173  replacement  pilots  assigned  to  RAG  training  in  the 
F-4 aircraft between December 1969 and March 1972. All pilots were designated Category 1,
since  they  had  never  flown  the  F-4,  and  consequently  were  required  to  complete  the  entire  RAG 
flight  training  syllabus.  Of  the  sample,  120  completed  training  at  VF-121,  the  West  Coast 
Squadron, while the remaining 53 completed training at VF-101, the East Coast Squadron. For
each  pilot,  the  following  information  was  obtained  and  used  as  a  potential  set  of  predictors:  (1) 
Selection  test  scores,  including  the  Aviation  Qualification  Test  (AQT)  and  the  Flight  Aptitude 
Rating  (FAR);  (2)  overall  primary  flight  grade;  (3)  overall  Basic  flight  grade;  (4)  stage  grades 
obtained  during  the  Advanced  phase  of  undergraduate  training;  and  (5)  final  RAG  grade.  Fur¬ 
thermore,  each  pilot  was  categorized  according  to  experience  level;  that  is,  whether  he  entered 
RAG  training  directly  from  the  undergraduate  training  command  or  had  previous  experience  as 
a  fleet  pilot  or  as  a  training  command  instructor. 

Rating forms developed in previous studies (6, 7) were sent to the respective fleet
squadron command for pilots within the sample. For the East Coast sample, respective squadron
commanders were asked to rate each pilot on two characteristics, Headwork and Basic Airwork
(7). For the West Coast sample, a 17-item questionnaire was used (6). For each pilot, individual
ratings on each item were summed to yield a total score. These total scores were standardized
and transformed to T-scores for each coast in an attempt to equate the two rating forms.
Squadron commanders were also asked to report any critical incidents such as
accidents/incidents/mishaps attributable to human error, pulled-wings, turned-in-wings, etc. For each pilot,
his final grade earned during RAG training was also obtained. These were also standardized for
each coast in order to statistically control for possible differences. Regression analyses using a
forward selection procedure were performed in an attempt to select those sets of variables which
best predicted each of the criteria.
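
The within-coast standardization described above can be sketched as follows. This is only an illustration under stated assumptions: it uses the conventional T-score definition (mean 50, standard deviation 10), it assumes the sample standard deviation was used, and the rating totals and the function name `t_scores` are hypothetical, not taken from the study.

```python
from statistics import mean, stdev

def t_scores(raw_totals):
    """Convert raw summed ratings to T-scores (mean 50, SD 10) within one group."""
    m = mean(raw_totals)
    s = stdev(raw_totals)  # sample SD; the report does not say which SD was used
    return [50 + 10 * (x - m) / s for x in raw_totals]

# Hypothetical totals: a two-item East Coast form and a 17-item West Coast form.
east_totals = [6, 8, 9, 7, 10]
west_totals = [52, 61, 70, 58, 66]

# Standardizing each coast separately puts both forms on a common scale.
east_t = t_scores(east_totals)
west_t = t_scores(west_totals)
```

Because each coast is standardized against its own mean and spread, a pilot's T-score expresses standing relative to others rated on the same form, which is what allows the two forms to be pooled.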

RESULTS 


The matrix of intercorrelations from which the regression analyses were performed
is  presented  in  Appendix  A.  Of  the  total  sample,  final  RAG  grades  were  obtained  for  171  pilots. 
The  results  of  the  regression  analyses  predicting  this  criterion  are  presented  in  Table  1.  It  should 
be  pointed  out  that  the  first  variable  to  be  entered,  experience  level,  was  initially  forced  into  all 
of  the  analyses  as  a  result  of  its  moderator  effects.  Its  importance  will  be  discussed  later. 
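
A forward selection procedure with one variable forced in first might look like the sketch below. It is not the authors' procedure: the stopping rule (`min_gain`), the variable names, and the synthetic data are all assumptions made for illustration, since the report does not state its entry criterion.

```python
import numpy as np

def multiple_r(X, y):
    """Multiple correlation R of y with a least-squares fit on the columns of X."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    ss_res = float(np.sum((y - A @ coef) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return float(np.sqrt(max(0.0, 1.0 - ss_res / ss_tot)))

def forward_select(X, y, names, forced, min_gain=0.001):
    """Force one variable in, then repeatedly add the variable that raises R most."""
    chosen = [names.index(forced)]
    history = [(forced, multiple_r(X[:, chosen], y))]
    remaining = [j for j in range(X.shape[1]) if j not in chosen]
    while remaining:
        trials = [(j, multiple_r(X[:, chosen + [j]], y)) for j in remaining]
        best_j, best_r = max(trials, key=lambda t: t[1])
        if best_r - history[-1][1] < min_gain:  # assumed stopping rule
            break
        chosen.append(best_j)
        remaining.remove(best_j)
        history.append((names[best_j], best_r))
    return history

# Synthetic data only; the names are hypothetical stand-ins for the predictors.
rng = np.random.default_rng(0)
names = ["experience", "formation", "transition", "far"]
X = rng.normal(size=(171, 4))
y = 0.3 * X[:, 0] + 0.5 * X[:, 1] + 0.2 * X[:, 2] + rng.normal(size=171)

history = forward_select(X, y, names, forced="experience")
for name, r in history:
    print(f"{name:12s} cumulative R = {r:.3f}")
```

Each row of `history` corresponds to one row of a "Variables Entered / Cumulative Multiple R" summary such as the tables in this section.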


Table 1

Summary of Regression Analysis Predicting Final RAG Grade

Variables Entered                Cumulative Multiple R

Experience Level                         .314
Formation                                .425
Transition                               .458
Flight Aptitude Rating                   .481
Basic Instruments                        .490
Aviation Qualification Test              .500
Instrument Navigation                    .501
Carrier Qualification                    .513

From the potential set of predictors, the selection scheme produced a seven-variable
equation yielding a final multiple R of .513. Of all the variables, experience level was most
related to the criterion, suggesting that second tour pilots tend to receive better RAG grades.
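
Squaring a multiple R gives the proportion of criterion variance accounted for, so the cumulative values reported for the final RAG grade can be converted into per-step gains. The sketch below does only that arithmetic on the reported figures; the function name `variance_accounted` is introduced here for illustration.

```python
# Cumulative multiple R values as reported for the final RAG grade criterion.
table1 = [
    ("Experience Level", 0.314),
    ("Formation", 0.425),
    ("Transition", 0.458),
    ("Flight Aptitude Rating", 0.481),
    ("Basic Instruments", 0.490),
    ("Aviation Qualification Test", 0.500),
    ("Instrument Navigation", 0.501),
    ("Carrier Qualification", 0.513),
]

def variance_accounted(table):
    """Return (name, R, R squared, gain in R squared) for each step."""
    rows, prev = [], 0.0
    for name, r in table:
        r2 = r * r
        rows.append((name, r, r2, r2 - prev))
        prev = r2
    return rows

rows = variance_accounted(table1)
for name, r, r2, gain in rows:
    print(f"{name:28s} R={r:.3f}  R^2={r2:.3f}  gain={gain:.3f}")
```

On these figures the final equation accounts for roughly a quarter of the variance in RAG grade (.513 squared is about .263), with most of the gain coming from the first two or three variables.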

From  Appendix  A,  it  is  apparent  that  experience  level  was  also  related  to  undergraduate  training 
measures.  This  is  not  surprising  since  assignment  following  completion  of  undergraduate  training 
is dependent upon a pilot's grades. In other words, those replacement pilots who enter RAG training
directly from the training command are those receiving the highest grades. Yet those who enter
from  other  fleet  squadrons  or  instructor  duty  have  had  the  benefit  of  added  experience.  Con¬ 
sequently,  they  tend  to  perform  better  in  RAG  training,  despite  the  fact  their  grades  during 
undergraduate  training  were  lower.  It  is  this  fact  which  tends  to  reduce  the  magnitudes  of  the 
zero-order  correlations  of  the  undergraduate  measures  with  the  criterion.  For  this  reason, 
experience-level  was  always  the  first  variable  to  be  entered  in  the  regression  analyses. 


The two stage grades contributing the most non-redundant variance to the explanation
of the criterion were the Advanced training stages of Formation and Transition. Interestingly
enough,  the  two  selection  scores,  the  AQT  and  FAR,  contained  residual  variance  negatively 
related  to  the  criterion. 


Of  the  total  sample,  fleet  ratings  by  squadron  commanding  officers  were  obtained  for 
99  pilots.  The  results  of  the  regression  analysis  predicting  this  criterion  are  presented  in  Table  2. 
A  five-variable  equation  yielding  a  multiple  R  of  .401  was  selected.  Again  the  trend  was  toward 
second  tour  pilots  receiving  higher  fleet  evaluations.  The  final  RAG  grade  as  well  as  the 
Advanced  formation  and  tactics  grades  were  all  positively  associated  with  the  criterion.  Similar 
to  the  analysis  predicting  the  final  RAG  grade,  the  Flight  Aptitude  Rating  yielded  a  residual  var¬ 
iance  negatively  associated  with  the  criterion. 


Table 2

Summary of Regression Analysis Predicting Fleet Evaluations

Variable Entered                 Cumulative Multiple R

Experience Level                         .150
Formation
Tactics                                  .367
Final RAG Grade                          .390
Flight Aptitude Rating                   .401

Critical incident information was obtained for 102 pilots. Of these, 25, or 24.5% of
the total, were credited with an incident. The results of the regression analysis predicting this
criterion are presented in Table 3. A four-variable equation yielding a multiple R of .335 was
selected. Of the variables entered into the equation, the final RAG grade shared the most variance
with the criterion.


Table 3

Summary of Regression Analysis Predicting Critical Incidents

Variable Entered                 Cumulative Multiple R

Experience Level                         .095
Final RAG Grade                          .304
Transition                               .319
Air-to-Ground                            .335

DISCUSSION 


The results indicate that, for each criterion measure, a certain proportion of the
variance can be reliably explained from a subset of the potential predictors. It is interesting to note
that somewhat different predictors emerged for each criterion. Of the total sample, both fleet
ratings and critical incident data were available for 90 pilots. The correlation between the criteria
was found to be -.400, indicating a negative association between high ratings and the occurrence
of an incident. While the sign of the correlation was in the expected direction, its magnitude
seemed rather low. This suggests the possibility that fleet performance in the F-4 may be
multidimensional in nature. In other words, adequate safe performance is not necessarily the same as
adequate flight performance.
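
To see why a correlation of -.400 can be read as "rather low", it helps to square it: the two criteria then share only about 16% of their variance. The sketch below computes a Pearson correlation between a dichotomous incident indicator and a continuous rating (the point-biserial case) on made-up data, then squares the reported value. The data and function name are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation; with a 0/1 variable this is the point-biserial r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical pilots: 1 = credited with an incident, paired with a rating T-score.
incident = [0, 0, 1, 0, 1, 0, 0, 1]
rating_t = [58, 61, 42, 55, 47, 50, 63, 44]
r_example = pearson_r(incident, rating_t)  # negative: incidents go with lower ratings

# Shared variance implied by the correlation reported in the text.
reported_r = -0.400
reported_shared = reported_r ** 2  # about 0.16
```

The remaining variance is unshared between the two criteria, which is consistent with the suggestion that safe performance and rated flight performance are related but distinct dimensions.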


Consider  the  mission  objectives  of  the  F-4  aircraft.  It  was  developed  primarily  for  use 
as  a  tactical  fighter,  thereby  demanding  proficiency  in  combat-related  skills.  It  seems  likely  that 
the  fleet  ratings  best  reflect  such  skills.  Within  the  Advanced  training  phase,  the  Tactics  stage 
grade  and  Formation  grade  were  most  highly  correlated  with  the  fleet  ratings.  This  is  not  too 
surprising  since  the  most  common  tactical  configuration  in  the  F-4  involves  two  aircraft  flying 
in  a  section  formation.  In  any  case,  combat-related  skills  appear  to  represent  the  most  important 
components  of  the  fleet  ratings. 


Several  other  interesting  findings  emerged.  A  trend  was  noted  suggesting  that  pilots 
having prior flight experience as either a fleet pilot or training command instructor tended to
receive  higher  fleet  ratings.  Furthermore,  the  FAR  produced  negative  relationships  with  each  of 
the  criteria.  However,  the  FAR  was  related  to  prior  experience  in  that  those  having  higher  scores 
tended  to  become  training  command  instructors.  Consequently,  those  second-tour  individuals 
having  lower  FARs  tended  to  receive  higher  ratings  as  a  possible  result  of  their  increased  exper¬ 
ience  level. 


In summary, the findings of this investigation suggest: (1) the final RAG grade can be
reliably predicted from previous flight performance during undergraduate training; (2) fleet
ratings are most reflective of combat-related skills; (3) critical incidents are best predicted by
performance in the RAG as estimated by the final RAG grade; and (4) pilots having previous
experience tend to make better fleet pilots in the F-4 community.